ADNet: Lane Shape Prediction via Anchor Decomposition
In this paper, we revisit the limitations of anchor-based lane detection
methods, which have predominantly relied on fixed anchors originating from the
edges of the image, limiting their versatility and quality. To overcome the
inflexibility of anchors, we decompose them into learning the heat map of
starting points and their associated directions. This decomposition removes the
limitations on the starting point of anchors, making our algorithm adaptable to
different lane types in various datasets. To enhance the quality of anchors, we
introduce the Large Kernel Attention (LKA) for Feature Pyramid Network (FPN).
This significantly increases the receptive field, which is crucial for capturing
sufficient context, as lane lines typically run through the entire image.
We have named our proposed system the Anchor Decomposition Network (ADNet).
Additionally, we propose the General Lane IoU (GLIoU) loss, which significantly
improves the performance of ADNet in complex scenarios. Experimental results on
three widely used lane detection benchmarks, VIL-100, CULane, and TuSimple,
demonstrate that our approach outperforms the state-of-the-art methods on
VIL-100 and exhibits competitive accuracy on CULane and TuSimple. Code and
models will be released at https://github.com/Sephirex-X/ADNet. Comment: ICCV 2023 accepted
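The core decomposition idea can be sketched in a few lines: instead of a fixed anchor, a lane anchor is represented by a predicted starting point (from a heat map) plus a direction angle, from which sample points along the anchor are reconstructed. This is an illustrative toy sketch, not the paper's implementation; the function name and parameters are hypothetical.

```python
import math

def anchor_to_points(start, theta_deg, length, num_points=5):
    """Reconstruct evenly spaced points along a lane anchor from its
    decomposed form: a starting point plus a direction angle.
    In ADNet's scheme, a heat map would predict `start` and a direction
    head would predict `theta_deg`; here both are given directly."""
    theta = math.radians(theta_deg)
    x0, y0 = start
    step = length / (num_points - 1)
    return [(x0 + i * step * math.cos(theta),
             y0 + i * step * math.sin(theta)) for i in range(num_points)]

# A start point anywhere in the image (not just the edges) with any
# direction defines a valid anchor under this decomposition.
points = anchor_to_points(start=(10.0, 200.0), theta_deg=-45.0, length=100.0)
```

Because the starting point is learned rather than pinned to the image border, the same parameterization adapts to lanes that begin mid-image, as in VIL-100 scenes.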
The Structure Transfer Machine Theory and Applications
Representation learning is a fundamental but challenging problem, especially
when the distribution of data is unknown. We propose a new representation
learning method, termed Structure Transfer Machine (STM), which enables feature
learning process to converge at the representation expectation in a
probabilistic way. We theoretically show that such an expected value of the
representation (mean) is achievable if the manifold structure can be
transferred from the data space to the feature space. The resulting structure
regularization term, named manifold loss, is incorporated into the loss
function of the typical deep learning pipeline. The STM architecture is
constructed to enforce the learned deep representation to satisfy the intrinsic
manifold structure from the data, which results in robust features that suit
various application scenarios, such as digit recognition, image classification
and object tracking. Compared to state-of-the-art CNN architectures, we achieve
better results on several commonly used benchmarks\footnote{The source code
is available at https://github.com/stmstmstm/stm}.
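The manifold-loss idea above can be illustrated with a minimal sketch: penalize the mismatch between pairwise distances in the data space and in the feature space, so that the learned representation preserves the data's intrinsic geometric structure. This is a toy simplification in the spirit of the STM regularizer, not the paper's exact formulation.

```python
import numpy as np

def manifold_loss(X, Z):
    """Toy structure-regularization term: mean squared difference between
    pairwise Euclidean distances in data space X and feature space Z.
    Zero iff the embedding preserves all pairwise distances exactly."""
    def pairwise(A):
        diff = A[:, None, :] - A[None, :, :]
        return np.sqrt((diff ** 2).sum(-1))
    Dx, Dz = pairwise(X), pairwise(Z)
    return float(((Dx - Dz) ** 2).mean())

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
loss_same = manifold_loss(X, X)       # identical geometry: zero loss
loss_diff = manifold_loss(X, 2 * X)   # distorted geometry: positive loss
```

In a deep pipeline, a term like this would be added to the task loss so that gradients push the feature extractor toward structure-preserving embeddings.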
GPA-3D: Geometry-aware Prototype Alignment for Unsupervised Domain Adaptive 3D Object Detection from Point Clouds
LiDAR-based 3D detection has made great progress in recent years. However,
the performance of 3D detectors is considerably limited when deployed in unseen
environments, owing to the severe domain gap problem. Existing domain adaptive
3D detection methods do not adequately consider the problem of the
distributional discrepancy in feature space, thereby hindering generalization
of detectors across domains. In this work, we propose a novel unsupervised
domain adaptive \textbf{3D} detection framework, namely \textbf{G}eometry-aware
\textbf{P}rototype \textbf{A}lignment (\textbf{GPA-3D}), which explicitly
leverages the intrinsic geometric relationship from point cloud objects to
reduce the feature discrepancy, thus facilitating cross-domain transferring.
Specifically, GPA-3D assigns a series of tailored and learnable prototypes to
point cloud objects with distinct geometric structures. Each prototype aligns
BEV (bird's-eye-view) features derived from corresponding point cloud objects
on source and target domains, reducing the distributional discrepancy and
achieving better adaptation. The evaluation results obtained on various
benchmarks, including Waymo, nuScenes and KITTI, demonstrate the superiority of
our GPA-3D over the state-of-the-art approaches for different adaptation
scenarios. The MindSpore version code will be publicly available at
\url{https://github.com/Liz66666/GPA3D}. Comment: Accepted by ICCV 2023
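A minimal sketch of the prototype-alignment idea: each feature is assigned to its nearest prototype, and a loss pulls features toward their assigned prototype, shrinking the feature-space discrepancy between source and target domains. This is a hypothetical simplification (hard nearest-prototype assignment, no geometry-aware grouping), not GPA-3D's actual implementation.

```python
import numpy as np

def prototype_alignment_loss(feats, prototypes):
    """Toy prototype alignment: mean squared distance from each BEV
    feature to its nearest prototype. Applying this to features from
    both domains pulls them toward shared anchor points in feature space."""
    # Squared distances from each feature to each prototype: (N, K)
    d2 = ((feats[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)
    nearest = d2.argmin(axis=1)  # hard assignment per feature
    return float(d2[np.arange(len(feats)), nearest].mean())

protos = np.array([[0.0, 0.0], [10.0, 10.0]])   # learnable in practice
src = np.array([[0.1, 0.0], [9.9, 10.0]])        # e.g. source-domain features
loss = prototype_alignment_loss(src, protos)
```

In GPA-3D the prototypes are tailored to distinct point-cloud geometric structures and the aligned features are BEV maps; the sketch above only conveys the alignment mechanism itself.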